Abstract

Motivated by applications such as the detection of sources of worms or viruses
in computer networks, identification of the origin of infectious diseases, determin- 
ing the causes of cascading failures in systems such as financial markets, or infer- 
ring the leader in a social network, we study the question of inferring the source of 
a rumor in a network based on the information about rumor infected nodes and the 
underlying network structure. 

We start by proposing a natural, effective model for the spread of the rumor 
in a network based on the classical SIR model. We obtain an estimator for the 
rumor source based on the infected nodes and the underlying network structure - 
it assigns each node a likelihood, which we call the rumor centrality. We show 
that the node with maximal rumor centrality is indeed the maximum likelihood 
estimator for regular trees. For general trees, we find the following surprising
phase transition: asymptotically in the size of the network, the estimator finds the
rumor source with probability 0 if the tree grows like a line, and it finds the rumor
source with probability strictly greater than 0 if the tree grows at a rate quicker
than a line. In a nutshell, our estimator is qualitatively the best possible estimator
for these graphs.

Our notion of rumor centrality naturally extends to arbitrary graphs. With extensive
simulations, we establish the effectiveness of our notion of rumor centrality.
Furthermore, we apply our estimator to identify the most powerful family in the 
15th century Florentine elite family marriage network - it indeed finds the correct 
family (i.e. the Medici) as the power center! 

1 Introduction 

In the modern world the ubiquity of networks has made us vulnerable to new types 
of network risks. These network risks arise in many different contexts, but share a 
common structure: an isolated risk is amplified because it is spread by the network. For 
example, as we have witnessed in the recent financial crisis, the strong dependencies
or 'network' between institutions have led to the situation where the failure of one
(or a few) institution(s) has led to global instabilities. More generally, various forms
of social networks allow information and instructions to be disseminated and finding 
the leader of these networks is of great interest for various purposes - identification 
of the 'latent leader' in a political network, identification of the 'hidden voice' in a 
spy network, or learning the unknown hierarchy of rulers in a historical setup. Finally, 
one wishes to identify the source of computer viruses or worms in the Internet and the
source of contagious diseases in populations in order to quarantine them. 

In essence, all of these situations can be modeled as a rumor spreading through 
a network. The goal is to find the source of the rumor in these networks in order to 
control and prevent these network risks based on limited information about the network 
structure and the 'rumor infected' nodes. In this paper, we will provide a systematic 
study of the question of identifying the rumor source based on the network structure 
and rumor infected nodes, as well as understand the fundamental limitations on this 
estimation problem. 

1.1 Related Work 

Prior work on rumor spreading has primarily focused on viral epidemics in popula- 
tions. The natural (and somewhat standard) model for viral epidemics is known as the 
susceptible-infected-recovered or SIR model [1]. In this model, there are three types 
of nodes: (i) susceptible nodes, capable of being infected; (ii) infected nodes that can 
spread the virus further; and (iii) recovered nodes that are cured and can no longer 
become infected. Research on the SIR model has focused on understanding how the
structure of the network and rates of infection/cure lead to large epidemics. This
motivated various researchers to propose network inference techniques to learn the
relevant network parameters. However, there has been little (or no)
work done on inferring the source of an epidemic.

The primary reason for the lack of such work is that it is quite challenging. To 
substantiate this, we briefly describe a closely related (and much simpler) problem of 
reconstruction on trees [9],[10], or more generally, on graphs [11]. In this problem one
node in the graph, call it the root node, starts with a value, say 0 or 1. This information
is propagated to its neighbors and their neighbors recursively along a breadth-first- 
search (BFS) tree of the graph (when the graph is a tree, the BFS tree is the graph). 
Now each transmission from a node to its neighbor is noisy - a transmitted bit is flipped 
with a small probability. The question of interest is to estimate or reconstruct the value 
of the root node, based on the 'noisy' information received at nodes that are far away 
from root. Currently, this problem is well understood only for graphs that are trees 
or tree-like, after a long history. Now the rumor source identification problem is, in
a sense, harder, as we wish to identify the location of the source among many nodes
based on the infected nodes - clearly a much noisier situation than the reconstruction 
problem. 

1.2 Our Contributions. 

In this paper, we provide a systematic study of the question of designing an estimator 
for the rumor source based on knowledge of the underlying network structure and the 
rumor infected nodes. To begin, we present a probabilistic model of rumor spreading 
in a network based on the SIR model. On one hand this is a natural and well studied 
model for rumor spreading; on the other hand it should be thought of as a good starting
point to undertake the systematic study of such inference problems. 






Following the approach of researchers working on the reconstruction problem and 
efficient inference algorithm design (e.g. Belief Propagation), we first address the
rumor source estimation problem for tree networks. We characterize the maximum 
likelihood estimator for the rumor source in regular trees. This estimator assigns to 
each node a likelihood which we call its rumor centrality. Rumor centrality strongly 
depends on the underlying topology of the rumor network as well as the rumor in- 
fected nodes. The notion of rumor centrality of a node readily extends to arbitrary tree 
networks. 

For arbitrary trees, we find the following surprising threshold phenomenon about
the estimator's effectiveness. If the number of nodes at a distance d from any node
in a tree scales like $d^\alpha$, then for trees with $\alpha = 0$ (i.e. line graphs), the detection
probability of our estimator will go to 0 as the network grows in size; but for trees
with $\alpha > 0$, the detection probability will always be strictly greater than 0 (uniformly
bounded away from 0) irrespective of the network size. In the latter case, we find that
the estimator error remains finite with probability 1, independent of the network size. In
the former case (i.e. $\alpha = 0$), it can be shown that for any estimator the detection
probability will go to 0. Thus, our estimator is essentially optimal for any tree
network.

Motivated by these results for trees, we develop a systematic approach to utilize 
the tree estimator - the rumor centrality - to develop an estimator for general net- 
works. This is possible because in essence, under the SIR model, rumors spread along 
a (random) sub-tree of the network. We perform extensive simulations to show that 
this estimator performs extremely well. In addition, we apply our estimator to the 15th 
century Florentine elite family marriage network and are able to accurately infer the 
most powerful family in the network - the Medici family. 

2 Estimator Construction 

In this section we start with a description of our rumor spreading model and then we 
define the maximum likelihood estimator for the rumor source. For regular tree graphs,
we equate the maximum likelihood estimator to a novel combinatorial quantity we call
rumor centrality. We obtain a closed form expression for this quantity. Using rumor 
centrality, we construct rumor source estimators for general trees and general graphs. 

2.1 Rumor Spreading Model. 

We consider a network of nodes to be modeled by an undirected graph G(V, E), where
V is a countably infinite set of nodes and E is the set of edges of the form (i, j) for
some i and j in V. We assume the set of nodes is countably infinite in order to avoid
boundary effects. We consider the case where initially only one node $v^*$ is the rumor
source.

We use a variant of the SIR model for the rumor spreading known as the susceptible- 
infected or SI model which does not allow for any nodes to recover, i.e. once a node 
has the rumor, it keeps it forever. Once a node i has the rumor, it is able to spread it 
to another node j if and only if there is an edge between them, i.e. if $(i, j) \in E$. The
time for a node i to spread the rumor to node j is modeled by an exponential random
variable $\tau_{ij}$ with rate $\lambda$. We assume without loss of generality that $\lambda = 1$. All $\tau_{ij}$'s are
independent and identically distributed.
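
The spreading model can be simulated directly. The following is a minimal Python sketch, not part of the original work: the function name `simulate_si`, the adjacency-dict representation, and the use of a finite graph in place of the countably infinite one are all assumptions made for illustration. Each edge out of a newly infected node receives an independent Exp(1) spreading time, and nodes are infected in order of arrival time.

```python
import heapq
import random

def simulate_si(adj, source, n_infected, rng=None):
    """Spread a rumor from `source` until `n_infected` nodes have it.

    adj: dict mapping node -> iterable of neighbours (undirected graph).
    Every edge leaving a newly infected node gets an independent Exp(1)
    spreading time. Returns the nodes in order of infection.
    """
    rng = rng or random.Random()
    infected_at = {source: 0.0}
    order = [source]
    # Heap of (arrival time, target node) for pending transmissions.
    heap = [(rng.expovariate(1.0), j) for j in adj[source]]
    heapq.heapify(heap)
    while heap and len(order) < n_infected:
        t, j = heapq.heappop(heap)
        if j in infected_at:
            continue  # j already received the rumor from another neighbour
        infected_at[j] = t
        order.append(j)
        for k in adj[j]:
            if k not in infected_at:
                heapq.heappush(heap, (t + rng.expovariate(1.0), k))
    return order
```

The returned list is one realization of the infection order; the set of its elements plays the role of the rumor graph observed at the instant the last of the `n_infected` nodes is infected.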

2.2 Rumor Source Maximum Likelihood Estimator 

We now assume that the rumor has spread in G(V, E) according to our model and that
N nodes have the rumor. These nodes are represented by a rumor graph $G_N(V, E)$
which is a subgraph of G(V, E). We will refer to this rumor graph as $G_N$ from here
on. The actual rumor source is denoted as $v^*$ and our estimator will be $\hat{v}$. We assume
that each node is equally likely to be the source a priori, so the best estimator will be
the maximum likelihood estimator. The only data we have available is the final rumor
graph $G_N$, so the estimator becomes

$$\hat{v} = \arg\max_{v \in G_N} P(G_N \mid v^* = v) \tag{1}$$

In general, $P(G_N \mid v^* = v)$ will be difficult to evaluate. However, we will show that in
regular tree graphs, it can be expressed in a simple closed form.

2.3 Rumor Source Estimator for Regular Trees 

To simplify our rumor source estimator, we consider the case where the underlying
graph is a regular tree where every node has the same degree. In this case, $P(G_N \mid v^* = v)$
can be exactly evaluated when we observe $G_N$ at the instant when the N-th node is
infected.

First, because of the tree structure of the network, there is a unique sequence of
nodes for the rumor to spread to each node in $G_N$. Therefore, to obtain the rumor
graph $G_N$, we simply need to construct a permutation of the N nodes subject to the
ordering constraints set by the structure of the rumor graph. We will refer to these
permutations as permitted permutations. For example, for the network in Figure 1, if
node 1 is the source, then {1, 2, 4} is a permitted permutation, whereas {1, 4, 2} is not
because node 2 must have the rumor before node 4.

Second, because of the memoryless property of the rumor spreading time between
nodes and the constant degree of all nodes, each permitted permutation resulting in
$G_N$ is equally likely. To see this, imagine every node has degree k and we wish to
find the probability of a permitted permutation $\sigma$ conditioned on $v^* = v$. A new node
joins along any free edge with equal probability. When it joins, it uses one free edge and
contributes k - 1 new ones, a net gain of k - 2 free edges. Therefore, the probability of any N node permitted
permutation $\sigma$ for any node v in $G_N$ is

$$P(\sigma \mid v^* = v) = \frac{1}{k} \cdot \frac{1}{k + (k-2)} \cdots \frac{1}{k + (N-2)(k-2)}$$

The probability of obtaining $G_N$ given that $v^* = v$ is obtained by summing the probability
of all permitted permutations which result in $G_N$. Because all of the permutations
are equally likely, $P(G_N \mid v^* = v)$ will be proportional to the number of permitted
permutations which start with v and result in $G_N$. Because we will find it necessary to
count the number of these permutations, we introduce the following definition:

Definition 1. Consider a tree T. Then R(v, T) is the number of permitted permutations
of nodes which start with node v and result in T. We refer to R(v, T) as the rumor
centrality of node v.
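
Definition 1 can be checked by brute force on small trees. The sketch below is illustrative only: the function name `count_permitted` and the five-node example tree are hypothetical and are not taken from the figures. It enumerates every ordering of the nodes that starts at the source and keeps those in which each later node already has an infected neighbour.

```python
from itertools import permutations

def count_permitted(adj, source):
    """Count orderings of all nodes that start at `source` and in which
    every later node has a neighbour earlier in the ordering."""
    others = [u for u in adj if u != source]
    count = 0
    for perm in permutations(others):
        seen = {source}
        ok = True
        for u in perm:
            if not any(w in seen for w in adj[u]):
                ok = False  # u would receive the rumor before any neighbour has it
                break
            seen.add(u)
        count += ok
    return count

# Hypothetical example tree: node 1 has children 2 and 3; node 2 has children 4 and 5.
tree = {1: [2, 3], 2: [1, 4, 5], 3: [1], 4: [2], 5: [2]}
print(count_permitted(tree, 1))  # 8
print(count_permitted(tree, 2))  # 12
```

The enumeration costs (N - 1)! steps, so it is useful only as a reference for validating faster methods on very small trees.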

With this definition, the likelihood is proportional to $R(v, G_N)$, so we can then
rewrite our estimator as

$$\hat{v} = \arg\max_{v \in G_N} P(G_N \mid v^* = v) = \arg\max_{v \in G_N} R(v, G_N) \tag{2}$$

Because the maximum likelihood estimator for the rumor source is also the node which
maximizes $R(v, G_N)$, we call this term the rumor centrality of the node v, and the node
which maximizes it the rumor center of the graph.

2.4 Rumor Source Estimator for General Trees 

To obtain the form of the rumor source estimator in equation (2), we relied on the
fact that every permitted permutation was equally likely in a regular tree. However,
in a general tree where node degrees may not all be the same, this fact may not hold.
This considerably complicates the construction of the maximum likelihood estimator.
To avoid this complication, we define the following randomized estimator for general
trees. Consider a rumor that has spread on a tree and reached all nodes in the subgraph
$G_N$. Then, let the estimate for the rumor source be a random variable $\hat{v}$ with the
following distribution.

$$P(\hat{v} = v \mid G_N) \propto R(v, G_N) \tag{3}$$

This estimator weighs each node by its rumor centrality. It is not the maximum likelihood
estimator as we had for regular trees. However, we will show that this estimator
is qualitatively as good as the best possible estimator for general trees.
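
Drawing from the distribution in (3) is straightforward once the rumor centralities are known. The sketch below assumes they are supplied as a dictionary of positive integers; the name `sample_source` is hypothetical. It uses exact integer arithmetic, since rumor centralities grow factorially and may not fit in a floating point number.

```python
import random

def sample_source(centrality, rng=None):
    """Draw a node with probability proportional to its rumor centrality.

    centrality: dict node -> positive integer R(v, G_N).
    """
    rng = rng or random.Random()
    total = sum(centrality.values())
    r = rng.randrange(total)  # uniform integer in [0, total)
    for node, weight in centrality.items():
        if r < weight:
            return node
        r -= weight
    raise AssertionError("unreachable for positive integer weights")
```

For instance, with centralities `{"a": 1, "b": 4, "c": 6, "d": 4, "e": 1}` (a five-node line), node `"c"` is returned with probability 6/16.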

2.5 Rumor Source Estimator for General Graphs 

When a rumor spreads in a network, each node receives the rumor from one other node. 
Therefore, there is a spanning tree corresponding to a rumor graph. If we knew this 
spanning tree, we could apply the previously developed tree estimators. However, the
spanning tree will be unknown in a general graph, complicating the
rumor source inference.

To begin constructing a rumor source estimator for a general graph, we first define
$\mathcal{T}(G_N)$ to be the set of all spanning trees of the rumor graph $G_N$. Then, we can
express the likelihood as a sum of likelihoods over all trees in $\mathcal{T}(G_N)$.

$$P(G_N \mid v^* = v) = \sum_{T \in \mathcal{T}(G_N)} P(T \mid v^* = v) \tag{4}$$

We showed that for regular trees every permitted permutation of nodes was equally
likely. We now assume this to be true for a general graph. With this assumption, the
likelihood of any spanning tree T given that the source is v is proportional to its rumor
centrality R(v, T). Then the rumor source estimator $\hat{v}$ will be

$$\hat{v} = \arg\max_{v \in G_N} \sum_{T \in \mathcal{T}(G_N)} R(v, T) \tag{5}$$

We show a practical implementation of this estimator in Section 3.

Figure 1: Illustration of variables $T^1_2$ and $T^1_7$.

2.6 Evaluating the Rumor Centrality

The rumor source estimators we have constructed all require us to evaluate the rumor
centrality of a tree graph, $R(v, G_N)$. We now show how to evaluate $R(v, G_N)$. To
begin, we first define a term which will be of use in our calculations.

Definition 2. $T^v_{v_j}$ is the number of nodes in the subtree rooted at node $v_j$, with node v
as the source.

To illustrate this definition, a simple example is shown in Figure 1. In this graph,
$T^1_2 = 3$ because there are 3 nodes in the subtree with node 2 as the root and node 1 as
the source. Similarly, $T^1_7 = 1$ because there is only 1 node in the subtree with node 7
as the root and node 1 as the source.

We now can count the permutations of $G_N$ with v as the source. In the following
analysis, we will abuse notation and use $T^v_{v_j}$ to refer to the subtrees and the number
of nodes in the subtrees. To begin, we assume v has k neighbors, $v_1, v_2, \ldots, v_k$. Each
of these nodes is the root of a subtree with $T^v_{v_1}, T^v_{v_2}, \ldots, T^v_{v_k}$ nodes, respectively. Each
node in the subtrees can receive the rumor after its respective root has the rumor. We
will have N slots in a given permitted permutation, the first of which must be the source
node v. Then, from the remaining N - 1 slots, we must choose $T^v_{v_1}$ slots for the nodes
in the subtree rooted at $v_1$. These nodes can be ordered in $R(v_1, T^v_{v_1})$ different ways.
With the remaining $N - 1 - T^v_{v_1}$ slots, we must choose $T^v_{v_2}$ slots for the tree rooted at
node $v_2$, and these can be ordered $R(v_2, T^v_{v_2})$ ways. We continue this way recursively
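to obtain

$$R(v, G_N) = \binom{N-1}{T^v_{v_1}} \binom{N-1-T^v_{v_1}}{T^v_{v_2}} \cdots \binom{N-1-\sum_{i=1}^{k-1} T^v_{v_i}}{T^v_{v_k}} \prod_{i=1}^{k} R(v_i, T^v_{v_i})$$

Now, to complete the recursion, we expand each of the $R(v_i, T^v_{v_i})$ in terms of the
subtrees rooted at the nearest neighbor children of these nodes. To simplify notation, we
label the nearest neighbor children of node $v_i$ with a second subscript, i.e. $v_{ij}$. We
continue this recursion until we reach the leaves of the tree. The leaf subtrees have 1
node and 1 permitted permutation. Therefore, the number of permitted permutations
for a given tree $G_N$ rooted at v is

$$\begin{aligned}
R(v, G_N) &= (N-1)! \prod_{i=1}^{k} \frac{R(v_i, T^v_{v_i})}{T^v_{v_i}!} \\
&= (N-1)! \prod_{i=1}^{k} \frac{(T^v_{v_i} - 1)!}{T^v_{v_i}!} \prod_{v_{ij} \in T^v_{v_i}} \frac{R(v_{ij}, T^v_{v_{ij}})}{T^v_{v_{ij}}!} \\
&= (N-1)! \prod_{i=1}^{k} \frac{1}{T^v_{v_i}} \prod_{v_{ij} \in T^v_{v_i}} \frac{R(v_{ij}, T^v_{v_{ij}})}{T^v_{v_{ij}}!} \\
&= N! \prod_{u \in G_N} \frac{1}{T^v_u}
\end{aligned} \tag{6}$$

In the last line, we have used the fact that $T^v_v = N$. We thus end up with a simple
expression for $R(v, G_N)$ in terms of the size of the subtrees of all nodes in $G_N$.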
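
The closed form $R(v, G_N) = N! / \prod_{u} T^v_u$ translates into a short routine. The following Python sketch is an illustration under assumed conventions (a tree stored as an adjacency dict, and the hypothetical function name `rumor_centrality`): it roots the tree at v, computes every subtree size, and divides N! by their product.

```python
from math import factorial

def rumor_centrality(adj, v):
    """R(v, G_N) = N! / prod_u T_u^v for a tree given as an adjacency dict."""
    parent = {v: None}
    order = []
    stack = [v]
    while stack:  # iterative DFS; every parent is listed before its children
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w != parent[u]:
                parent[w] = u
                stack.append(w)
    size = {}
    prod = 1
    for u in reversed(order):  # children are processed before their parent
        size[u] = 1 + sum(size[w] for w in adj[u] if w != parent[u])
        prod *= size[u]
    return factorial(len(order)) // prod

# Hypothetical example tree: node 1 has children 2 and 3; node 2 has children 4 and 5.
tree = {1: [2, 3], 2: [1, 4, 5], 3: [1], 4: [2], 5: [2]}
print(rumor_centrality(tree, 1))  # 8
```

With node 1 as the source, the subtree sizes in this example are 5, 3, 1, 1, 1, so the result is 5!/15 = 8. Python integers are arbitrary precision, which matters here because N! quickly exceeds 64 bits.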



3 Evaluating the Rumor Source Estimator 

In the following sections we present algorithms for evaluating the rumor source estima- 
tor for trees and general graphs. For trees, the estimator is the rumor centrality defined 
earlier. We present a message passing algorithm to evaluate the rumor centrality of all 
nodes in a tree. Rumor centrality plays an important role in the rumor source estimator 
for general graphs. We present an algorithm for evaluating the rumor source estimator 
in a general graph using the rumor centrality algorithm for trees in combination with 
an algorithm for generating uniformly distributed random spanning trees. 



3.1 Trees: A Message Passing Algorithm 

In order to find the rumor center of a tree graph of N nodes $G_N$, we need to first find
the rumor centrality of every node in $G_N$. To do this we need the size of the subtrees
$T^v_u$ for all v and u in $G_N$. There are $N^2$ of these subtrees, but we can utilize a local
condition of the rumor centrality in order to calculate all the rumor centralities with
only O(N) computation. Consider two neighboring nodes u and v in $G_N$. All of their
subtrees will be the same size except for those rooted at u and v. In fact, there is a
special relation between these two subtrees.

$$T^u_v = N - T^v_u \tag{7}$$

For example, in Figure 1, for node 1, $T^1_2$ has 3 nodes, while for node 2, $T^2_1$ has $N - T^1_2$
or 4 nodes. Because of this relation, we can relate the rumor centralities of any two
neighboring nodes.

$$R(u, G_N) = R(v, G_N) \frac{T^v_u}{N - T^v_u} \tag{8}$$

This result is the key to our algorithm for calculating the rumor centrality for all nodes
in $G_N$. We first select any node v as the source node and calculate the size of all of its
subtrees $T^v_u$ and its rumor centrality $R(v, G_N)$. This can be done by having each node
u pass two messages up to its parent. The first message is the number of nodes in u's
subtree, which we call $t^{\mathrm{up}}_{u \to \mathrm{parent}(u)}$. The second message is the cumulative product
of the size of the subtrees of all nodes in u's subtree, which we call $p^{\mathrm{up}}_{u \to \mathrm{parent}(u)}$.
The parent node then adds the $t^{\mathrm{up}}_{u \to \mathrm{parent}(u)}$ messages together (plus one for itself) to obtain the size of
its own subtree, and multiplies the $p^{\mathrm{up}}_{u \to \mathrm{parent}(u)}$ messages together with its own subtree size to obtain its
cumulative subtree product. These messages are then passed upward until the source
node receives the messages. By multiplying the cumulative subtree products of its
children and dividing (N - 1)! by the result, the source node will obtain its rumor centrality, $R(v, G_N)$. This algorithm
will require only O(N) computation.

With the rumor centrality of node v, we then evaluate the rumor centrality for the
children of v using equation (8). Each node u passes its rumor centrality to its children
in a message we define as $r^{\mathrm{down}}_{u \to \mathrm{child}(u)}$. Each node u can calculate its rumor centrality
using its parent's rumor centrality and its own subtree size $T^v_u$. The computational
effort of this algorithm is also O(N). Therefore, the overall algorithm obtains the
rumor centrality of all N nodes with O(N) computation. The pseudocode for this
message passing algorithm is shown in Algorithm 1 for completeness.
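
A compact Python rendering of the two passes may help make the message flow concrete. It is a sketch under assumptions: the tree is an adjacency dict, the name `all_rumor_centralities` is hypothetical, and the "messages" are simply stored in dictionaries `t` and `p` rather than exchanged between processes. The upward pass accumulates subtree sizes and their products; the downward pass applies $R(u) = R(\mathrm{parent}(u)) \, T_u / (N - T_u)$.

```python
from math import factorial

def all_rumor_centralities(adj, root):
    """Return {u: R(u, G_N)} for every node of a tree with O(N) arithmetic steps."""
    n = len(adj)
    parent = {root: None}
    order = []
    stack = [root]
    while stack:  # DFS order: each parent appears before its children
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w != parent[u]:
                parent[w] = u
                stack.append(w)
    # Upward pass: t[u] is the subtree size of u, p[u] the product of all
    # subtree sizes within u's subtree (including t[u] itself).
    t, p = {}, {}
    for u in reversed(order):
        children = [w for w in adj[u] if w != parent[u]]
        t[u] = 1 + sum(t[w] for w in children)
        p[u] = t[u]
        for w in children:
            p[u] *= p[w]
    # Root: R = N! / prod_u T_u, and p[root] already contains T_root = N.
    r = {root: factorial(n) // p[root]}
    # Downward pass: neighbour relation between parent and child centralities.
    for u in order[1:]:
        r[u] = r[parent[u]] * t[u] // (n - t[u])
    return r

line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(all_rumor_centralities(line, 0))  # {0: 1, 1: 4, 2: 6, 3: 4, 4: 1}
```

On the five-node line the centralities are the binomial coefficients 1, 4, 6, 4, 1, so the rumor center is the middle node. The multiplication is done before the integer division so that every intermediate quotient is exact.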

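Algorithm 1 Rumor Center Message Passing Algorithm

Choose a root node $v \in G_N$
for u in $G_N$ do
    if u is a leaf then
        $t^{\mathrm{up}}_{u \to \mathrm{parent}(u)} = 1$
        $p^{\mathrm{up}}_{u \to \mathrm{parent}(u)} = 1$
    else
        if u is the root v then
            $r^{\mathrm{down}}_{v \to \mathrm{child}(v)} = \frac{N!}{N \prod_{j \in \mathrm{children}(v)} p^{\mathrm{up}}_{j \to v}}$
        else
            $t^{\mathrm{up}}_{u \to \mathrm{parent}(u)} = \sum_{j \in \mathrm{children}(u)} t^{\mathrm{up}}_{j \to u} + 1$
            $p^{\mathrm{up}}_{u \to \mathrm{parent}(u)} = t^{\mathrm{up}}_{u \to \mathrm{parent}(u)} \prod_{j \in \mathrm{children}(u)} p^{\mathrm{up}}_{j \to u}$
            $r^{\mathrm{down}}_{u \to \mathrm{child}(u)} = r^{\mathrm{down}}_{\mathrm{parent}(u) \to u} \frac{t^{\mathrm{up}}_{u \to \mathrm{parent}(u)}}{N - t^{\mathrm{up}}_{u \to \mathrm{parent}(u)}}$
        end if
    end if
end for

3.2 General Graphs

For a general graph $G_N$ with N nodes, recall that the rumor source estimator was of
the form

$$\hat{v} = \arg\max_{v \in G_N} \sum_{T \in \mathcal{T}(G_N)} R(v, T) \tag{9}$$

where $\mathcal{T}(G_N)$ was the set of all spanning trees of $G_N$. If we consider the spanning tree
T to be a uniformly distributed random variable on the sample space $\mathcal{T}(G_N)$, where
each tree has probability $1/|\mathcal{T}(G_N)|$, then we can rewrite the sum as an expectation of
the random variable R(v, T) over this uniform distribution.

$$\sum_{T \in \mathcal{T}(G_N)} R(v, T) = |\mathcal{T}(G_N)| \sum_{T \in \mathcal{T}(G_N)} \frac{R(v, T)}{|\mathcal{T}(G_N)|} = |\mathcal{T}(G_N)| \, E[R(v, T)] \tag{10}$$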

We now need a way to evaluate the above expectation for all nodes in $G_N$. We accomplish
this using two algorithms. The first is an algorithm for generating uniformly
distributed spanning trees utilizing a random walk on $G_N$ [12]. The second is the
previous algorithm for calculating the rumor centrality on a tree.

To generate uniformly distributed random spanning trees, we perform a random
walk on $G_N$ in the following manner. The random walk starts at a random node and
moves to any of the node's neighbors with equal probability. The random walk continues
in this way on $G_N$ until the graph is covered (i.e. until every node is reached).

Once the random walk has covered every node in $G_N$, we obtain a spanning tree
with the following construction. We call the first node in the random walk $v_{\mathrm{start}}$. For
each node $v \in G_N \setminus \{v_{\mathrm{start}}\}$, we add to the spanning tree the edge (w, v) which corresponds
to the first transition into node v in the random walk. For example, consider a
random walk on the graph in Figure 2 with the covering random walk node sequence
{1, 2, 4, 2, 1, 3}. Then the generated tree will consist of edges {(1, 2), (1, 3), (2, 4)},
as indicated in the figure. The trees generated by this random walk on $G_N$ will have
a uniform distribution and the runtime of this algorithm is given by the cover time of
$G_N$, which for most graphs is O(N log N) and for the worst graphs $O(N^3)$ [12].

Figure 2: The random walk {1, 2, 4, 2, 1, 3} generated on the graph $G_N$ as indicated
by the sequence of arrows (left), and the resulting spanning tree (right).
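
The first-entry construction is short enough to state in code. The sketch below is an illustration with hypothetical names (`tree_from_walk`, `random_spanning_tree`) and an adjacency-dict graph; this random-walk construction is commonly known as the Aldous-Broder algorithm. The first function turns any covering walk into a spanning tree; the second generates the walk.

```python
import random

def tree_from_walk(walk):
    """Keep, for every node except the first, the edge of its first entry."""
    seen = {walk[0]}
    edges = []
    for prev, cur in zip(walk, walk[1:]):
        if cur not in seen:
            seen.add(cur)
            edges.append((prev, cur))
    return edges

def random_spanning_tree(adj, rng=None):
    """Walk randomly until every node is visited, then extract first-entry edges.

    adj: dict node -> list of neighbours of a connected undirected graph.
    """
    rng = rng or random.Random()
    nodes = list(adj)
    cur = rng.choice(nodes)
    walk = [cur]
    visited = {cur}
    while len(visited) < len(nodes):
        cur = rng.choice(list(adj[cur]))  # move to a uniformly chosen neighbour
        walk.append(cur)
        visited.add(cur)
    return tree_from_walk(walk)

print(tree_from_walk([1, 2, 4, 2, 1, 3]))  # [(1, 2), (2, 4), (1, 3)]
```

Applied to the walk from the example in the text, `tree_from_walk` returns the same three edges, listed in the order in which the nodes were first reached.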

Once we have generated a tree, we use the tree rumor centrality algorithm to calculate
the rumor centrality for every node in the tree. We generate many trees and take the
average of the rumor centralities for each node. The node with the maximum average
value becomes our estimate of the rumor source. In more detail, if we define the i-th
generated tree as $T_i$, and M total trees are generated, then our estimator will be

$$\hat{v} = \arg\max_{v \in G_N} \frac{1}{M} \sum_{i=1}^{M} R(v, T_i) \tag{11}$$



4 Detection Probability: A Threshold Phenomenon 

This section examines the behavior of the detection probability of the rumor source 
estimators for different graph structures. We establish that the asymptotic detection 
probability has a phase-transition effect: for line graphs it is 0, while for trees with 
finite growth it is strictly greater than 0. 

4.1 Line Graphs: No Detection 

We first consider the detection probability for a line graph. This is a regular tree with 
degree 2, so we use the maximum likelihood estimator for regular trees. We will es- 
tablish the following result for the performance of the rumor source estimator in a line 
graph. 

Theorem 1. Define the event of correct rumor source detection after time t on a linear
graph as $C_t$. Then the probability of correct detection of the maximum likelihood rumor
source estimator, $P(C_t)$, scales as

$$P(C_t) = O\left(\frac{1}{\sqrt{t}}\right)$$

As can be seen, the line graph detection probability scales as $t^{-1/2}$, which goes to
0 as t goes to infinity. The intuition for this result is that the rumor source estimator
provides very little information because of the linear graph's trivial structure.

We generated 1000 rumor graphs per rumor graph size on an underlying linear
graph. The detection probability versus the graph size is shown in Figure 3. As can be
seen, the detection probability decays as $N^{-1/2}$, as predicted in Theorem 1.

Figure 3: Detection probability for line graphs. The dotted line is a plot of
$\sqrt{2/\pi}\, N^{-1/2}$ and the circles are the empirical detection probability.

4.2 Proof of Theorem 1

In this section, we present a proof of Theorem 1. The rumor spreading in the line graph
is equivalent to 2 independent Poisson processes with rate 1 beginning at the source
and spreading in opposite directions. The following theorem, which is proved in the
appendix, bounds the number of arrivals in a Poisson process in time t.

Theorem 2. Consider a Poisson process N(·) with rate 1, and a small positive $\epsilon$. In a
time t, where t is large, the probability of having fewer than $t(1 - \epsilon)$ arrivals is bounded
by

V{N{t) < t{l - e)) < c{t + 5)i/2e-*^' 

for some positive c and some small positive $\delta$.

Also, in a time t, the probability of having more than $t(1 + \epsilon)$ arrivals is bounded
by

V{N{t) >t(l + e)) <e-*'' 

Therefore, with high probability, after a time t, for some small $\epsilon$, the number of total
nodes N, which is the sum of the arrivals in both Poisson processes, will be bounded
by

$$2t(1 - \epsilon) \le N \le 2t(1 + \epsilon) \tag{12}$$

If N is fixed, the detection probability can be easily calculated. However, we want
the detection probability after a fixed time t. Therefore, we define $C_N$ as the event
of correct detection given N nodes in the graph. Then we can rewrite $P(C_t)$ as

$$P(C_t) = \sum_{N} P(C_N) P(N \mid t) \le \sum_{N = 2t(1-\epsilon)}^{2t(1+\epsilon)} P(C_N) P(N \mid t) + P\big(N \notin [2t(1-\epsilon),\, 2t(1+\epsilon)] \mid t\big)$$

For large t, we can neglect the last term on the right, which is exponentially small by
Theorem 2, so the above expression reduces to

$$P(C_t) \approx \sum_{N = 2t(1-\epsilon)}^{2t(1+\epsilon)} P(C_N) P(N \mid t) \tag{13}$$

We now consider N to be a fixed quantity and evaluate $P(C_N)$.

Because of the linear structure of the underlying graph, all rumor graphs $G_N$ with
N nodes are isomorphic (they are all lines of length N). For any $G_N$, the estimate
for the rumor source $\hat{v}$ will be the node at the center of the line. The following lemma
makes this more precise.

Lemma 1. For a linear rumor graph with N nodes, label nodes a distance k from
one side of the line as $v_k$. Then, if N is odd, the rumor source estimator will be node
$v_{(N+1)/2}$. If N is even, the rumor source estimator is either node $v_{N/2}$ or node $v_{N/2+1}$
with equal probability.

To prove this, we first must evaluate the rumor centrality of a node in the line graph.
For a node $v_k$ a distance k from one end, the rumor centrality is

$$\begin{aligned}
R(v_k, G_N) &= \frac{N!}{N \prod_{i=1}^{k-1} i \, \prod_{j=1}^{N-k} j} \\
&= \frac{(N-1)!}{(k-1)!\,(N-k)!} \\
&= \frac{(N-1)!}{(k-1)!\,(N-1-(k-1))!} \\
&= \frac{N'!}{k'!\,(N'-k')!} = \binom{N'}{k'}
\end{aligned} \tag{14}$$

where $N' = N - 1$ and $k' = k - 1$. We see that the rumor centrality $R(v_k, G_N)$ is just the binomial coefficient. It is known
that this will be maximized when $k'$ is $N'/2$ for even $N'$, and when $k'$ is $(N'+1)/2$
or $(N'-1)/2$ for odd $N'$. In terms of the original labels for the line graph, the rumor
centrality is maximized for $k = (N+1)/2$ for odd N and for $k = N/2$ and $k = N/2 + 1$
for even N. This proves Lemma 1.






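Without loss of generality, we now assume that N is odd and that the rumor source
estimator $\hat{v}$ is node $v_{(N+1)/2}$. The detection probability $P(C_N)$ will then be equal to
the conditional probability that $v^* = v_{(N+1)/2}$ given a graph $G_N$. To evaluate this
probability, we express it in terms of the rumor centrality of the nodes.

$$\begin{aligned}
P(v^* = v_{(N+1)/2} \mid G_N) &= \frac{P(G_N \mid v^* = v_{(N+1)/2}) \, P(v^* = v_{(N+1)/2})}{\sum_{v \in G_N} P(G_N \mid v^* = v) \, P(v^* = v)} \\
&= \frac{R(v_{(N+1)/2}, G_N) \, P(v^* = v_{(N+1)/2})}{\sum_{v \in G_N} R(v, G_N) \, P(v^* = v)} \\
&= \frac{R(v_{(N+1)/2}, G_N)}{\sum_{v \in G_N} R(v, G_N)}
\end{aligned} \tag{15}$$

Now we can evaluate the detection probability. With $N' = N - 1$ and the binomial form of the rumor centrality,

$$P(C_N) = \frac{\binom{N'}{N'/2}}{\sum_{k'=0}^{N'} \binom{N'}{k'}} \tag{16}$$

To simplify the expression above, we use Stirling's approximation for N!,

$$N! \approx \sqrt{2\pi N} \left(\frac{N}{e}\right)^{N}$$

along with the identity

$$\sum_{k=0}^{N} \frac{N!}{k!\,(N-k)!} = 2^{N} \tag{17}$$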
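Then, the detection probability becomes

$$P(C_N) = \frac{N'!}{\left((N'/2)!\right)^2} \, 2^{-N'} \approx \frac{\sqrt{2\pi N'} \, (N'/e)^{N'}}{\pi N' \, (N'/(2e))^{N'}} \, 2^{-N'} = \sqrt{\frac{2}{\pi N'}} = O\left(\frac{1}{\sqrt{N}}\right)$$

Now we need to convert this expression from a function of N to a function of t. Using
equation (13), we obtain

$$P(C_t) \approx \sum_{N = 2t(1-\epsilon)}^{2t(1+\epsilon)} P(C_N) P(N \mid t) = \sum_{N = 2t(1-\epsilon)}^{2t(1+\epsilon)} O\left(\frac{1}{\sqrt{N}}\right) P(N \mid t) = O\left(\frac{1}{\sqrt{t}}\right)$$

The last step holds because every N in the sum is at least $2t(1 - \epsilon)$ and the
probabilities $P(N \mid t)$ sum to at most 1. This completes the proof of Theorem 1.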
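
The Stirling step can be checked numerically. The short Python sketch below (function names are hypothetical) compares the exact detection probability for odd N, the central binomial coefficient divided by $2^{N'}$ with $N' = N - 1$, against the asymptotic form $\sqrt{2/(\pi N')}$.

```python
from math import comb, pi, sqrt

def line_detection_probability(n):
    """Exact P(C_N) for odd N on a line graph: C(N', N'/2) / 2^N' with N' = N - 1."""
    if n % 2 == 0:
        raise ValueError("this sketch covers odd N only")
    n_prime = n - 1
    return comb(n_prime, n_prime // 2) / 2 ** n_prime

def stirling_estimate(n):
    """Asymptotic form sqrt(2 / (pi * N')) with N' = N - 1 (requires N > 1)."""
    return sqrt(2.0 / (pi * (n - 1)))

print(line_detection_probability(5))  # 0.375
```

For N = 5 the exact value is 6/16 = 0.375. As N grows, the exact value and the estimate agree increasingly well, consistent with the $N^{-1/2}$ decay.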

4.3 Geometric Trees: Non-Trivial Detection

We now consider the detection probability of our estimator in a geometric tree, which
is a non-regular tree parameterized by a number $\alpha$. If we let n(d) denote the maximum
number of nodes a distance d from any node, then there exist constants b and c such
that $b \le c$ and

$$b\, d^{\alpha} \le n(d) \le c\, d^{\alpha} \tag{18}$$

We use the randomized estimator for geometric trees. For this estimator, we obtain the 
following result. 

Theorem 3. Define the event of correct rumor source detection after time t on a geometric
tree with parameter $\alpha > 0$ as $C_t$. Then the probability of correct detection of
the randomized rumor source estimator, $P(C_t)$, is strictly greater than 0. That is,

$$\liminf_{t \to \infty} P(C_t) > 0$$

This theorem says that $\alpha = 0$ and $\alpha > 0$ serve as a threshold for non-trivial
detection: for $\alpha = 0$, the graph is essentially a linear graph, so we would expect the
detection probability to go to 0 based on Theorem 1. While Theorem 3 only deals with
correct detection, one would also be interested in the size of the rumor source estimator
error. We obtain the following result for the estimator error.

Lemma 2. Define $d(\hat{v}, v^*)$ as the distance from the rumor source estimator $\hat{v}$ to the
rumor source $v^*$. Assume a rumor has spread for a time t on a geometric tree with
parameter $\alpha > 0$. Then, for any $\epsilon > 0$, there exists an $l \ge 0$ such that

$$\liminf_{t \to \infty} P\big(d(\hat{v}, v^*) \le l\big) \ge 1 - \epsilon$$

What this lemma says is that no matter how large the rumor graph becomes, most
of the detection probability mass concentrates on a region close to the rumor source $v^*$.

We generated 1000 instances of rumor graphs per rumor graph size on underlying
geometric trees. The $\alpha$ parameters ranged from 0 to 4. As can be seen in Figure 4, the
detection probability remains constant as the tree size grows for strictly positive $\alpha$ and
decays to 0 for $\alpha = 0$, as predicted by Theorem 3. Notice that the detection probability
for non-zero $\alpha$ is close to 1. A histogram for the geometric tree with $\alpha = 1$ shows
that the error is no larger than 4 hops. This indicates that the estimator error remains
bounded, in accordance with Lemma 2.

Figure 4: Detection probability for geometric trees vs. number of nodes N (left), and
histogram of estimator error for a 100 node geometric tree with $\alpha = 1$ (right).

4.4 Proof of Theorem 3

In this section we present a proof of Theorem 3. This proof involves 3 steps. First,
we show that the rumor graph will have a certain structure with high probability. This
allows us to put bounds on $T^{v^*}_u$, the sizes of the subtrees with the rumor source as the
source node. Then, we express the detection probability in terms of the variables $T^{v^*}_u$.
Finally, we show that with this structure for the rumor graphs, the detection probability
is bounded away from zero. Throughout we assume that the underlying geometric tree
satisfies the property that there exist constants b and c such that $b \le c$ and the number
of nodes a distance d from any node, n(d), is bounded by

$$b\, d^{\alpha} \le n(d) \le c\, d^{\alpha} \tag{19}$$
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Figure 5: Partitioning of geometric tree for evaluating S. 



Structure of Rumor Graphs. We wish to understand the structure of a rumor graph on an underlying geometric tree. To do this, we first assume that the rumor has been spreading for a long time $t$. Then, we will formally show that there are two conditions that the rumor graph $G_t$ will satisfy. First, the rumor graph will contain every node within a distance $t(1-\epsilon)$ of the source node, for some small positive $\epsilon$. Second, there will not be any nodes beyond a distance $t(1+\epsilon)$ from the source node. Figure 5 shows the basic structure of the rumor graph. It is full up to a distance $t(1-\epsilon)$ and does not extend beyond $t(1+\epsilon)$. We now formally state our results for the structure of the rumor graph.

Theorem 4. Consider a geometric tree with parameter $\alpha$ on which a rumor spreads for a long time $t$, and let $\epsilon = t^{-1/2+\delta}$ for some small $\delta$. Define the resulting rumor graph as $G_t$ and $\mathcal{G}_t$ as the set of all rumor graphs which occur after a time $t$ that have the following two properties: every node within a distance $t(1-\epsilon)$ from the source receives the rumor and there are no nodes with the rumor beyond a distance $t(1+\epsilon)$ from the source. Then,

$$\lim_{t \to \infty} P(G_t \in \mathcal{G}_t) = 1 \qquad (20)$$

To prove this theorem, we first note that every spreading time is exponentially distributed with an identical parameter, which we assume to be 1 without loss of generality. Then after a time $t$, a node a distance $t(1-\epsilon)$ from the source having the rumor is equivalent to a Poisson process $N(\cdot)$ with rate 1 having at least $t(1-\epsilon)$ arrivals in time $t$. Theorem 2 bounds the number of arrivals in the Poisson process.
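As a rough numerical illustration of this equivalence (a sketch, not part of the original analysis): on a tree, the rumor reaches a node $d$ hops from the source by time $t$ exactly when the sum of the $d$ independent rate-1 exponential spreading times along the path is at most $t$, and this has the same probability as a rate-1 Poisson process having at least $d$ arrivals by time $t$. The function names, the seed, and the values of $d$ and $t$ below are illustrative assumptions.

```python
import math
import random


def poisson_pmf(k, t):
    """P(N(t) = k) for a rate-1 Poisson process, computed in log space."""
    return math.exp(-t + k * math.log(t) - math.lgamma(k + 1))


def prob_at_least(d, t):
    """P(N(t) >= d): probability of at least d arrivals by time t."""
    return 1.0 - sum(poisson_pmf(k, t) for k in range(d))


def simulate_reach(d, t, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that a node d hops from the
    source has the rumor by time t, with i.i.d. Exp(1) spreading times."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Arrival time at the node = sum of d spreading times along the path.
        arrival = sum(rng.expovariate(1.0) for _ in range(d))
        if arrival <= t:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    d, t = 8, 10.0
    print("exact Poisson tail:", prob_at_least(d, t))
    print("simulated reach   :", simulate_reach(d, t))
```

The two printed numbers should agree up to Monte Carlo noise, which is the sense in which the events $E_i$ and $A_i$ below can be analyzed through Poisson tail bounds.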

Now, we define the following events.

• $E_i$ = Node $i$, which is a distance $t(1-\epsilon)$ from the source, has the rumor

• $F$ = All nodes less than a distance $t(1-\epsilon)$ from the source have the rumor

• $A_i$ = Node $i$, which is a distance $t(1+\epsilon)$ from the source, has the rumor

• $B$ = All nodes greater than a distance $t(1+\epsilon)$ from the source do not have the rumor

We begin by proving that all nodes within a distance $t(1-\epsilon)$ of the source have the rumor. At a distance $t(1-\epsilon)$ there are at most $c[t(1-\epsilon)]^{\alpha}$ nodes for the geometric tree. With this we now apply the union bound to the probability of event $F$.

$$
\begin{aligned}
P(F) &= P\left(\bigcap_{i=1}^{c[t(1-\epsilon)]^{\alpha}} E_i\right) \\
&= 1 - P\left(\bigcup_{i=1}^{c[t(1-\epsilon)]^{\alpha}} E_i^c\right) \\
&\geq 1 - \sum_{i=1}^{c[t(1-\epsilon)]^{\alpha}} P(E_i^c) \\
&\geq 1 - c[t(1-\epsilon)]^{\alpha} P(E_i^c) \\
&\geq 1 - c\,t^{\alpha} P(E_i^c)
\end{aligned}
$$

Event $E_i^c$ occurring means a node a distance $t(1-\epsilon)$ from the source does not have the rumor. This is equivalent to a Poisson process of rate 1 having fewer than $t(1-\epsilon)$ arrivals in time $t$. We can use Theorem 2 to upper bound $P(E_i^c)$.

$$P(E_i^c) \leq a\sqrt{t}\,e^{-t\epsilon^2}$$

Using this bound, we now obtain a lower bound for $P(F)$.

$$
\begin{aligned}
P(F) &\geq 1 - c\,t^{\alpha} P(E_i^c) \\
&\geq 1 - a c\,t^{\alpha+1/2} e^{-t\epsilon^2}
\end{aligned}
$$

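We now wish to take the limit as $t$ approaches infinity. However, $\epsilon$ is dependent upon $t$, so care must be taken. Substituting in the expression for $\epsilon$ and taking the limit we obtain

$$
\begin{aligned}
\lim_{t \to \infty} P(F) &\geq \lim_{t \to \infty} 1 - a c\,t^{\alpha+1/2} e^{-t\epsilon^2} \\
&\geq \lim_{t \to \infty} 1 - a c\,t^{\alpha+1/2} e^{-t^{2\delta}} \\
&\geq 1
\end{aligned}
$$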



Now we wish to prove that all nodes beyond a distance $t(1+\epsilon)$ from the source do not have the rumor. We will follow a similar procedure as we did for proving the first half of Theorem 4. At a distance $t(1+\epsilon)$ there are at most $c[t(1+\epsilon)]^{\alpha}$ nodes for the geometric tree. With this we now apply the union bound to the probability of event $B$.

$$
\begin{aligned}
P(B) &= P\left(\bigcap_{i=1}^{c[t(1+\epsilon)]^{\alpha}} A_i^c\right) \\
&= 1 - P\left(\bigcup_{i=1}^{c[t(1+\epsilon)]^{\alpha}} A_i\right) \\
&\geq 1 - \sum_{i=1}^{c[t(1+\epsilon)]^{\alpha}} P(A_i) \\
&\geq 1 - c[t(1+\epsilon)]^{\alpha} P(A_i)
\end{aligned}
$$

Event $A_i$ occurring means a node a distance $t(1+\epsilon)$ from the source has the rumor. This is equivalent to a Poisson process of rate 1 having at least $t(1+\epsilon)$ arrivals in time $t$. We can use Theorem 2 to upper bound $P(A_i)$.

$$P(A_i) \leq e^{-t\epsilon^2}$$

Using this bound, we now obtain a lower bound for $P(B)$.

$$
\begin{aligned}
P(B) &\geq 1 - c[t(1+\epsilon)]^{\alpha} P(A_i) \\
&\geq 1 - c[t(1+\epsilon)]^{\alpha} e^{-t\epsilon^2}
\end{aligned}
$$



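We now wish to take the limit as $t$ approaches infinity. Again, we substitute in the expression for $\epsilon$ and take the limit.

$$
\begin{aligned}
\lim_{t \to \infty} P(B) &\geq \lim_{t \to \infty} 1 - c[t(1+\epsilon)]^{\alpha} e^{-t\epsilon^2} \\
&\geq \lim_{t \to \infty} 1 - c\left[t\left(1+t^{-1/2+\delta}\right)\right]^{\alpha} e^{-t^{2\delta}} \\
&\geq 1
\end{aligned}
$$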



This completes the proof of Theorem 4.
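A small numerical sketch of the two limits used in this proof may help show why the choice $\epsilon = t^{-1/2+\delta}$ matters: the polynomial factor in $t$ is eventually overwhelmed by $e^{-t\epsilon^2} = e^{-t^{2\delta}}$, although for small $t$ the bounds can be vacuous (negative). The constants `a` and `c` and the values of `alpha` and `delta` below are illustrative assumptions, not values taken from the analysis.

```python
import math


def bound_F(t, alpha, delta, a=1.0, c=1.0):
    """Lower bound on P(F): 1 - a*c*t^(alpha+1/2)*exp(-t*eps^2),
    with eps = t^(-1/2+delta)."""
    eps = t ** (-0.5 + delta)
    return 1.0 - a * c * t ** (alpha + 0.5) * math.exp(-t * eps ** 2)


def bound_B(t, alpha, delta, c=1.0):
    """Lower bound on P(B): 1 - c*[t(1+eps)]^alpha*exp(-t*eps^2),
    with eps = t^(-1/2+delta)."""
    eps = t ** (-0.5 + delta)
    return 1.0 - c * (t * (1.0 + eps)) ** alpha * math.exp(-t * eps ** 2)


if __name__ == "__main__":
    for t in (10.0, 100.0, 1000.0, 10000.0):
        print(t, bound_F(t, alpha=1.0, delta=0.25), bound_B(t, alpha=1.0, delta=0.25))
```

With these assumed constants, both bounds approach 1 as $t$ grows, which is the behavior the limit argument relies on.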

Detection Probability in terms of $T^{v^*}_{v_i}$. Our rumor source estimator is a random variable $\hat{v}$ which takes the value $v$ with probability proportional to $R(v, G_t)$. The conditional probability of correct detection given a rumor graph $G_t$ will be the probability of this estimator choosing the source node $v^*$, which is $P(\hat{v} = v^* | G_t)$. We showed that all rumor graphs will belong to the set $\mathcal{G}_t$ with probability 1 for large $t$. Therefore, we lower bound the probability of correct detection $P(C_t)$ as

$$
\begin{aligned}
\liminf_{t} P(C_t) &= \liminf_{t} \sum_{G_t} P(\hat{v} = v^* | G_t) P(G_t) \\
&\geq \liminf_{t} \left( \inf_{G_t \in \mathcal{G}_t} P(\hat{v} = v^* | G_t) \right) \liminf_{t} P(G_t \in \mathcal{G}_t) \\
&\geq \liminf_{t} \inf_{G_t \in \mathcal{G}_t} P(\hat{v} = v^* | G_t)
\end{aligned}
$$



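We see that the detection probability is lower bounded by the infimum of the conditional detection probability $P(\hat{v} = v^* | G_t)$ over $G_t \in \mathcal{G}_t$. Next, we express the detection probability in terms of the size of the subtrees $T^{v^*}_{v_i}$.

$$
\begin{aligned}
\liminf_{t} P(C_t) &\geq \liminf_{t} \inf_{G_t \in \mathcal{G}_t} P(\hat{v} = v^* | G_t) \\
&\geq \liminf_{t} \inf_{G_t \in \mathcal{G}_t} \frac{R(v^*, G_t)}{\sum_{v \in G_t} R(v, G_t)} \\
&\geq \liminf_{t} \inf_{G_t \in \mathcal{G}_t} \frac{\prod_{u \in G_t} \frac{1}{T^{v^*}_{u}}}{\sum_{v \in G_t} \prod_{v_i \in G_t} \frac{1}{T^{v}_{v_i}}} \\
&\geq \liminf_{t} \inf_{G_t \in \mathcal{G}_t} \left( \sum_{v \in G_t} \prod_{v_i \in G_t} \frac{T^{v^*}_{v_i}}{T^{v}_{v_i}} \right)^{-1} \qquad (21)
\end{aligned}
$$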

The structure of rumor graphs in $\mathcal{G}_t$ will allow us to bound the sizes of subtrees whose source is node $v^*$ ($T^{v^*}_{v_i}$). Therefore, if we can express $P(\hat{v} = v^* | G_t)$ in terms of $T^{v^*}_{v_i}$, we will be able to bound the detection probability.

In order to evaluate the detection probability for a general tree, we must relate $T^{v}_{v_i}$ to $T^{v^*}_{v_i}$. We have already seen that when node $v$ is one hop from $v^*$, all of the subtrees are the same except for those rooted at $v$ and $v^*$. In fact, we showed that for a graph with $N$ total nodes,

$$T^{v}_{v^*} = N - T^{v^*}_{v} \qquad (22)$$



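For a node $v$ one hop from $v^*$, the product in equation 21 becomes

$$
\begin{aligned}
\prod_{v_i \in G_t} \frac{T^{v^*}_{v_i}}{T^{v}_{v_i}} &= \frac{T^{v^*}_{v^*}\, T^{v^*}_{v}}{T^{v}_{v^*}\, T^{v}_{v}} \qquad (23) \\
&= \frac{T^{v^*}_{v}}{N - T^{v^*}_{v}} \qquad (24)
\end{aligned}
$$

When $v$ is two hops from $v^*$, all of the subtrees are the same except for those rooted at $v$, $v^*$, and the node in between, which we call node 1. Figure 6 shows an example. In this case, the product in equation 21 becomes

$$
\begin{aligned}
\prod_{v_i \in G_t} \frac{T^{v^*}_{v_i}}{T^{v}_{v_i}} &= \frac{T^{v^*}_{v^*}\, T^{v^*}_{1}\, T^{v^*}_{v}}{T^{v}_{v^*}\, T^{v}_{1}\, T^{v}_{v}} \qquad (25) \\
&= \frac{T^{v^*}_{1}\, T^{v^*}_{v}}{\left(N - T^{v^*}_{1}\right)\left(N - T^{v^*}_{v}\right)} \qquad (26)
\end{aligned}
$$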



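Continuing this way, we find that in general, for any node $v$ in $G_t$,

$$\prod_{v_i \in G_t} \frac{T^{v^*}_{v_i}}{T^{v}_{v_i}} = \prod_{v_i \in \mathcal{P}(v^*, v)} \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} \qquad (27)$$

where $\mathcal{P}(v^*, v)$ denotes the set of nodes on the path between $v^*$ and $v$, not including $v^*$. The detection probability of the rumor source estimator is then

$$
\begin{aligned}
\liminf_{t} P(C_t) &\geq \liminf_{t} \inf_{G_t \in \mathcal{G}_t} \left( 1 + \sum_{v \in G_t \setminus v^*} \prod_{v_i \in \mathcal{P}(v^*, v)} \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} \right)^{-1} \\
&\geq \liminf_{t} \inf_{G_t \in \mathcal{G}_t} \frac{1}{S}
\end{aligned}
$$

We call the resulting summation $S$ and will need to upper bound it in order to get a lower bound on the detection probability.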
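The identity between the detection probability and $1/S$ can be checked on a small tree. The sketch below assumes the form of rumor centrality used earlier in the paper, $R(v, G) = N! \prod_{u \in G} 1/T^{v}_{u}$; the five-node example tree, the adjacency-dictionary representation, and all function names are illustrative assumptions. Exact rational arithmetic is used so that the two sides can be compared without rounding.

```python
from fractions import Fraction
from math import factorial


def subtree_sizes(adj, root):
    """Return (size, parent), where size[u] is T_u^root, the number of nodes
    in the subtree rooted at u when `root` is taken as the source."""
    parent = {root: None}
    order = [root]
    for u in order:                     # BFS; `order` grows while iterating
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                order.append(w)
    size = {u: 1 for u in order}
    for u in reversed(order):           # children are processed before parents
        if parent[u] is not None:
            size[parent[u]] += size[u]
    return size, parent


def rumor_centrality(adj, v):
    """R(v, G) = N! / prod_u T_u^v (assumed definition, see lead-in)."""
    size, _ = subtree_sizes(adj, v)
    prod = 1
    for s in size.values():
        prod *= s
    return Fraction(factorial(len(adj)), prod)


def path_sum_S(adj, source):
    """S = 1 + sum over v != source of the product, along the path from the
    source to v (source excluded), of T_u / (N - T_u) with T_u = T_u^source."""
    size, parent = subtree_sizes(adj, source)
    n = len(adj)
    total = Fraction(1)
    for v in adj:
        if v == source:
            continue
        term, u = Fraction(1), v
        while u != source:
            term *= Fraction(size[u], n - size[u])
            u = parent[u]
        total += term
    return total


# Hypothetical example: node 0 is the source, with children 1 and 2;
# node 1 has children 3 and 4.
EXAMPLE_TREE = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}

if __name__ == "__main__":
    R = {v: rumor_centrality(EXAMPLE_TREE, v) for v in EXAMPLE_TREE}
    print("P(correct detection):", R[0] / sum(R.values()))
    print("1 / S               :", 1 / path_sum_S(EXAMPLE_TREE, 0))
```

On this example both printed values coincide, which is what equation (27) predicts; the factor $N!$ cancels in the ratio, so only the subtree sizes matter.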

Upper Bounding S. In this section we will show that the sum $S$ has a finite upper bound. We start with an underlying geometric tree with parameter $\alpha > 0$. We then assume we have a rumor graph $G_t$ with $N$ nodes which belongs to $\mathcal{G}_t$. To evaluate the detection probability, we must upper bound the sum

$$S = 1 + \sum_{v \in G_t \setminus v^*} \prod_{v_i \in \mathcal{P}(v^*, v)} \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} \qquad (28)$$

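We know from Theorem 4 that after a time $t$ the graph will be full up to $t(1-\epsilon)$ with $\epsilon = t^{-1/2+\delta}$ as before. We will now divide $G_t$ into two parts as shown in Figure 5. The first part is the portion of the graph within a distance $t(1-\epsilon)$ from the source and not including the source, and is denoted $G_0$. The remaining nodes will form graph $G_1$. We can then break the sum $S$ into two parts.

$$
\begin{aligned}
S &= 1 + \sum_{v \in G_0} \prod_{v_i \in \mathcal{P}(v^*, v)} \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} + \sum_{v \in G_1} \prod_{v_i \in \mathcal{P}(v^*, v)} \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} \\
S &= 1 + S_0 + S_1
\end{aligned}
$$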



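First we will upper bound $S_0$. To do this, we must first count the number of nodes in $G_0$, which we will call $N_0$. We know that there are $n(d)$ nodes a distance $d$ from the source. By summing over $d$ up to $t(1-\epsilon)$ we obtain the following bounds for $N_0$.

$$
\begin{aligned}
\sum_{d=1}^{t(1-\epsilon)} b\,d^{\alpha} &\leq N_0 \leq \sum_{d=1}^{t(1-\epsilon)} c\,d^{\alpha} \\
\frac{b\,[t(1-\epsilon)]^{\alpha+1}}{\alpha+1} &\leq N_0 \leq \frac{c\,[t(1-\epsilon)]^{\alpha+1}}{\alpha+1}
\end{aligned}
$$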

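We have approximated the sum by an integral, which is valid when $t$ is large. Now, we must calculate $N_1$, the number of nodes in $G_1$. To do this, we note that from Theorem 4 there are no nodes beyond a distance $t(1+\epsilon)$. Therefore, using the integral approximation again for the sum, we obtain the following bounds for $N_1$.

$$
\begin{aligned}
\frac{b\,t^{\alpha+1}}{\alpha+1}\left((1+\epsilon)^{\alpha+1} - (1-\epsilon)^{\alpha+1}\right) &\leq N_1 \leq \frac{c\,t^{\alpha+1}}{\alpha+1}\left((1+\epsilon)^{\alpha+1} - (1-\epsilon)^{\alpha+1}\right) \\
\frac{2b\epsilon(\alpha+1)\,t^{\alpha+1}}{\alpha+1} &\leq N_1 \leq \frac{2c\epsilon(\alpha+1)\,t^{\alpha+1}}{\alpha+1} \\
2b\epsilon\,t^{\alpha+1} &\leq N_1 \leq 2c\epsilon\,t^{\alpha+1}
\end{aligned}
$$

We used the first order term of the binomial approximation for $(1 \pm \epsilon)^{\alpha+1}$ above. Now we rewrite $S_0$ in a more convenient notation.

$$
\begin{aligned}
S_0 &= \sum_{v \in G_0} \prod_{v_i \in \mathcal{P}(v^*, v)} \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} \qquad (29) \\
&= \sum_{v \in G_0} \prod_{v_i \in \mathcal{P}(v^*, v)} w_{v_i} \qquad (30) \\
&= \sum_{v \in G_0} b_v \qquad (31)
\end{aligned}
$$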



Now, to upper bound $S_0$, we group the $b_v$ according to the distance of $v$ from $v^*$. We denote $a_d$ as the maximum value of $b_v$ among the set of nodes a distance $d$ from the source. Then we can upper bound $S_0$ as

$$S_0 \leq \sum_{d=1}^{t(1-\epsilon)} c\,d^{\alpha} a_d$$



Now, to calculate $a_d$, we first must evaluate the $w_{v_i}$ term in equation (30). To do this, we consider a node $v_i \in G_0$ a distance $i$ from the source. For this node, we upper bound the number of nodes in its subtree by dividing all $N_0$ nodes in $G_0$ among the minimum $b\,i^{\alpha}$ nodes a distance $i$ from the root. Then, to this we add all $N_1$ nodes in $G_1$ to get the following upper bound on $T^{v^*}_{v_i}$.

$$T^{v^*}_{v_i} \leq \frac{N_0}{b\,i^{\alpha}} + N_1$$

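With this, we obtain the following upper bound for $w_{v_i}$.

$$
\begin{aligned}
w_{v_i} = \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} &\leq \frac{\frac{N_0}{b\,i^{\alpha}} + N_1}{N_0 - \frac{N_0}{b\,i^{\alpha}}} \\
&\leq c_1 \left( \frac{1}{b\,i^{\alpha}} + \frac{N_1}{N_0} \right) \\
&\leq c_1 \left( \frac{1}{b\,i^{\alpha}} + \frac{2c\epsilon(\alpha+1)}{b(1-\epsilon)^{\alpha+1}} \right)
\end{aligned}
$$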

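The constant $c_1$ is equal to $(1 - 1/b)^{-1}$. Now, we write down an upper bound for $S_0$, recalling that $\epsilon = t^{-1/2+\delta}$.

$$
\begin{aligned}
S_0 &\leq \sum_{d=1}^{t(1-\epsilon)} c\,d^{\alpha} a_d \\
&\leq \sum_{d=1}^{t(1-\epsilon)} c\,d^{\alpha} \prod_{i=1}^{d} c_1 \left( \frac{1}{b\,i^{\alpha}} + \frac{2c\,t^{-1/2+\delta}(\alpha+1)}{b\left(1 - t^{-1/2+\delta}\right)^{\alpha+1}} \right) \\
&\leq \sum_{d=1}^{t(1 - t^{-1/2+\delta})} c\,d^{\alpha} \prod_{i=1}^{d} c_1 \left( \frac{1}{b\,i^{\alpha}} + \frac{2c\,d^{-1/2+\delta}(\alpha+1)}{b\left(1 - d^{-1/2+\delta}\right)^{\alpha+1}} \right)
\end{aligned}
$$

In the last line, we used the fact that $d \leq t$ to upper bound the product.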

We define the terms in the above sum corresponding to a specific value of $d$ as $A_d$. Then, we use an infinite sum to upper bound this sum.

$$S_0 \leq \sum_{d=1}^{t(1 - t^{-1/2+\delta})} A_d \leq \sum_{d=1}^{\infty} A_d$$

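If we apply the ratio test to the terms of the infinite sum, we find that

$$
\begin{aligned}
\limsup_{d} \frac{A_d}{A_{d-1}} &= \limsup_{d} \left( \frac{d}{d-1} \right)^{\alpha} c_1 \left( \frac{1}{b\,d^{\alpha}} + \frac{2c\,d^{-1/2+\delta}(\alpha+1)}{b\left(1 - d^{-1/2+\delta}\right)^{\alpha+1}} \right) \\
&= 0
\end{aligned}
$$

Thus, the infinite sum converges, so $S_0$ also converges. Now we only need to show convergence of $S_1$.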

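We upper bound $S_1$ in the same way as we did for $S_0$. We write the sum as

$$
\begin{aligned}
S_1 &= \sum_{v \in G_1} \prod_{v_i \in \mathcal{P}(v^*, v)} \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} \qquad (32) \\
&= \sum_{v \in G_1} \prod_{v_i \in \mathcal{P}(v^*, v)} w_{v_i} \qquad (33) \\
&= \sum_{v \in G_1} \prod_{v_i \in \mathcal{P}(v^*, v),\, v_i \in G_0} w_{v_i} \prod_{v_i \in \mathcal{P}(v^*, v),\, v_i \in G_1} w_{v_i} \qquad (34) \\
&= \sum_{v \in G_1} \left( \prod_{v_i \in \mathcal{P}(v^*, v),\, v_i \in G_0} w_{v_i} \right) b_v \qquad (35)
\end{aligned}
$$

where, for $v \in G_1$, $b_v$ denotes the product of the $w_{v_i}$ over the nodes of $\mathcal{P}(v^*, v)$ that lie in $G_1$.

To upper bound $S_1$, we group the $b_v$ according to the distance of $v$ from the top of $G_1$. We denote $a_d$ as the maximum value of $b_v$ among the set of nodes a distance $d$ from the top of $G_1$. We also denote the upper bound of the product of $w_{v_i}$ over nodes in $\mathcal{P}(v^*, v)$ and $G_0$ as $\Gamma$. Then we can upper bound $S_1$ as

$$
\begin{aligned}
S_1 &\leq \sum_{v \in G_1} \Gamma\, b_v \\
&\leq \Gamma \sum_{d=1}^{2t\epsilon} c\,d^{\alpha} a_d
\end{aligned}
$$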



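Now, to calculate $a_d$, we upper bound the $w_{v_i}$ for nodes in $G_1$. We assume that every subtree in $G_1$ has size $N_1$. Then, similar to our procedure for $S_0$, we upper bound the weights $w_{v_i}$ for the nodes in $G_1$.

$$
\begin{aligned}
w_{v_i} = \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}} &\leq \frac{N_1}{N - N_1} \\
&\leq \frac{N_1}{N_0} \\
&\leq \frac{2c\epsilon(\alpha+1)}{b(1-\epsilon)^{\alpha+1}}
\end{aligned}
$$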

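Recalling that $\epsilon = t^{-1/2+\delta}$, we upper bound $S_1$ as

$$
\begin{aligned}
S_1 &\leq \Gamma \sum_{d=1}^{2t\epsilon} c\,d^{\alpha} \prod_{i=1}^{d} \frac{2c\epsilon(\alpha+1)}{b(1-\epsilon)^{\alpha+1}} \\
&\leq \Gamma \sum_{d=1}^{2t^{1/2+\delta}} c\,d^{\alpha} \left( \frac{2c\,t^{-1/2+\delta}(\alpha+1)}{b\left(1 - t^{-1/2+\delta}\right)^{\alpha+1}} \right)^{d} \\
&\leq \Gamma \sum_{d=1}^{2t^{1/2+\delta}} c\,d^{\alpha} \left( \frac{2c\,d^{-1/2+\delta}(\alpha+1)}{b\left(1 - d^{-1/2+\delta}\right)^{\alpha+1}} \right)^{d}
\end{aligned}
$$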



Above we have used the relation that $d \leq t$. Similar to what was done for $S_0$, we upper bound this sum with an infinite sum, writing $B_d$ for the term corresponding to a specific value of $d$.

$$S_1 \leq \sum_{d=1}^{2t^{1/2+\delta}} B_d \leq \sum_{d=1}^{\infty} B_d$$

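If we apply the ratio test to the terms of the infinite sum, we find that

$$
\begin{aligned}
\limsup_{d} \frac{B_d}{B_{d-1}} &= \limsup_{d} \left( \frac{d}{d-1} \right)^{\alpha} \frac{2c\,d^{-1/2+\delta}(\alpha+1)}{b\left(1 - d^{-1/2+\delta}\right)^{\alpha+1}} \\
&= 0
\end{aligned}
$$

Again, the ratio test proves convergence of the sum $S_1$.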

We have now shown that the sum $S = 1 + S_0 + S_1$ is upper bounded by some finite $S^*$. With this, we can lower bound the detection probability for the geometric tree.

$$
\begin{aligned}
\liminf_{t} P(C_t) &\geq \liminf_{t} \inf_{G_t \in \mathcal{G}_t} \frac{1}{S} \\
&\geq \frac{1}{S^*} \\
&> 0
\end{aligned}
$$

This completes the proof of Theorem 3.

4.5 Proof of Lemma 2

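We utilize Theorem 3 to prove Lemma 2. First, we rewrite the distribution of the estimator $\hat{v}$ on a rumor graph $G_t$ formed after a rumor has spread for a time $t$.

$$
\begin{aligned}
P(\hat{v} = v) &= \frac{R(v, G_t)}{\sum_{u \in G_t} R(u, G_t)} \\
&= \frac{R(v, G_t)/R(v^*, G_t)}{\sum_{u \in G_t} R(u, G_t)/R(v^*, G_t)} \\
&= \frac{\rho(v, G_t)}{\sum_{u \in G_t} \rho(u, G_t)}
\end{aligned}
$$

where $\rho(v, G_t)$ is defined as follows using equation 27

$$\rho(v, G_t) = \prod_{v_i \in \mathcal{P}(v^*, v)} \frac{T^{v^*}_{v_i}}{N - T^{v^*}_{v_i}}$$

We recognize the sum of $\rho(v, G_t)$ over all $v$ in $G_t$ as the sum $S$ which was previously shown to converge to a positive constant $S^*$. Now, let $d(\hat{v}, v^*)$ be the distance between the rumor source estimator and the rumor source. We can write the probability of the estimator error being greater than $l$ hops as

$$
\begin{aligned}
P(d(\hat{v}, v^*) > l \,|\, G_t) &= \frac{\sum_{v : d(v, v^*) > l} \rho(v, G_t)}{\sum_{v \in G_t} \rho(v, G_t)} \\
&= \frac{1}{S} \sum_{v : d(v, v^*) > l} \rho(v, G_t)
\end{aligned}
$$

We select an $\epsilon > 0$ and define $\epsilon_1 = \epsilon S$. Then, because of the convergence of the sum $S$, there exists an $l \geq 0$ such that

$$
\begin{aligned}
\sum_{v : d(v, v^*) > l} \rho(v, G_t) &\leq \epsilon_1 \\
&\leq \epsilon S
\end{aligned}
$$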

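Now, using this result along with Theorem 4 we find the limiting behavior of the probability of the error being at most $l$ hops:

$$
\begin{aligned}
\liminf_{t} P(d(\hat{v}, v^*) \leq l) &= 1 - \limsup_{t} P(d(\hat{v}, v^*) > l) \\
&= 1 - \limsup_{t} \sum_{G_t \in \mathcal{G}_t} P(d(\hat{v}, v^*) > l \,|\, G_t) P(G_t) \\
&\geq 1 - \limsup_{t} \left( \frac{1}{S} \sum_{v : d(v, v^*) > l} \rho(v, G_t) \right) \limsup_{t} P(G_t \in \mathcal{G}_t) \\
&\geq 1 - \limsup_{t} \frac{\epsilon S}{S} \\
&\geq 1 - \epsilon
\end{aligned}
$$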

Thus, for any positive $\epsilon$, there will always be a finite $l$ such that the probability of the estimator being within $l$ hops of the rumor source is greater than $1 - \epsilon$, no matter how large the rumor graph is.
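To make the quantities in this proof concrete, the following sketch computes the conditional distribution of the hop error $d(\hat{v}, v^*)$ on one fixed small tree, using $P(\hat{v} = v \,|\, G_t) = \rho(v, G_t)/S$ with $\rho$ taken as the path product from equation 27. The five-node tree, the adjacency-dictionary representation, and the function names are illustrative assumptions; the example only illustrates the formula and says nothing about the asymptotic regime of the lemma.

```python
from collections import defaultdict
from fractions import Fraction


def root_tree(adj, source):
    """BFS from `source`; return (visit order, parent map, hop distances)."""
    parent, dist, order = {source: None}, {source: 0}, [source]
    for u in order:                     # `order` grows while iterating
        for w in adj[u]:
            if w not in parent:
                parent[w], dist[w] = u, dist[u] + 1
                order.append(w)
    return order, parent, dist


def error_distribution(adj, source):
    """Return {k: P(d(v_hat, v*) = k | G_t)} computed as rho(v) / S."""
    order, parent, dist = root_tree(adj, source)
    n = len(order)
    size = {u: 1 for u in order}
    for u in reversed(order):           # accumulate subtree sizes bottom-up
        if parent[u] is not None:
            size[parent[u]] += size[u]
    rho = {source: Fraction(1)}
    for u in order[1:]:                 # parents precede children in BFS order
        rho[u] = rho[parent[u]] * Fraction(size[u], n - size[u])
    S = sum(rho.values())
    pmf = defaultdict(Fraction)
    for u in order:
        pmf[dist[u]] += rho[u] / S
    return dict(pmf)


def tail_probability(pmf, l):
    """P(d(v_hat, v*) > l | G_t) from the hop-error distribution."""
    return sum(p for k, p in pmf.items() if k > l)


if __name__ == "__main__":
    tree = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
    pmf = error_distribution(tree, source=0)
    for k in sorted(pmf):
        print("error of", k, "hops:", pmf[k])
    print("P(error > 1):", tail_probability(pmf, 1))
```

The tail probability returned by `tail_probability` is the finite-graph analogue of the sum over $v$ with $d(v, v^*) > l$ that the proof bounds by $\epsilon S$.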



5 Simulation Results for General Graphs 

This section provides simulation results for our rumor source estimators on two general graphs: a simulated grid graph and a real network. For the grid graph, several random rumor graph instances were generated on the underlying grid and the statistics of the rumor source estimator were collected. The real network we used is the marriage network of elite families in 15th century Florence. We find that our estimator performs extremely well for both networks.

Figure 7: Example of a 100 node rumor graph on a grid (top). Detection probability for the grid graph vs. number of nodes N (bottom left). Histogram of the estimator error [hops] for a 100 node rumor graph (bottom right).

5.1 Grid Graphs

Grid graphs are not trees, so we must utilize the general graph rumor source estimator. We generated 100 instances of rumor graphs per rumor graph size on an underlying grid graph. To calculate the expectation value in equation 10, 1000 trees were generated per rumor graph. Figure 7 shows an example of a 100 node rumor graph on a grid. In this case, our estimator was able to find the rumor source exactly. Next is a plot of the detection probability of the estimator versus rumor graph size. We find that for rumor graphs with up to 100 nodes, the detection probability does not go to 0. Finally, we show a histogram of the estimator error for a 100 node rumor graph. As can be seen, we never obtain an error greater than 3 hops. This empirical data indicates that the general graph estimator should have good performance on general graphs.

Figure 8: The 15th century Florentine elite family marriage network [13]. The darkened node is our estimate of the rumor source, which in this case is the Medici family. This family is also the true center of power in this network.

5.2 Florentine Marriage Network: A Future Application 

In order to see if our estimator can be applied to situations beyond finding rumor sources, we used it on the marriage network of elite families in 15th century Florence. This is a well known network in the social science literature. The links in this network represent a marriage between families. It is known that the Medici family wielded the most power and so was effectively the center of the network [13]. Even though there was no rumor spreading, our estimator found, rather surprisingly, that the Medici family was the source of this network. This indicates that our estimator may do more than just determine the rumor source. It may also indicate which nodes are important or influential in a network. The Florentine marriage network can be seen in Figure 8.



6 Conclusion and Future Work

We constructed estimators for the rumor source in regular trees, general trees, and general graphs. We defined the maximum likelihood estimator for a regular tree to be a new notion of network centrality which we called rumor centrality. We used rumor centrality as the basis for estimators for general trees and general graphs.

We analyzed the asymptotic behavior of the rumor source estimators for line graphs and geometric trees. For line graphs, it was shown that the detection probability goes to 0 as the network grows in size. However, for geometric trees, it was shown that the estimator detection probability is bounded away from 0 as the graph grows in size. Simulations performed on synthetic graphs agreed with these tree results and also demonstrated that the general graph estimator performed well. The general graph estimator was also able to predict the most powerful family in the 15th century Florentine elite family marriage network. This indicates that this estimator may be able to find influential nodes in networks in addition to finding rumor sources.

There are several future steps for this work. First, we would like to develop estima- 
tors when the spreading times are not identically distributed. Second, we would like to 
create a message passing algorithm for the general graph estimator in order for it to be 
applicable to distributed environments. Third, we would like to test our estimators on 
other real networks to accurately assess their performance. 



7 Proof of Theorem 2

To prove the bound for $t(1-\epsilon)$ arrivals in a Poisson process $N(\cdot)$ of rate 1, we first write down the exact probability of this event.

$$P(N(t) \leq t(1-\epsilon)) = e^{-t} \sum_{i=0}^{t(1-\epsilon)} \frac{t^{i}}{i!}$$

Next, we upper bound the sum by noting that its terms are monotonically increasing. To see this, we take the ratio of consecutive terms.

$$\frac{t^{i-1}\, i!}{(i-1)!\, t^{i}} = \frac{i}{t}$$

This ratio is less than 1 if $i < t$, which is true for the sum. Therefore, we upper bound the sum by taking all terms equal to the largest term.

$$P(N(t) \leq t(1-\epsilon)) \leq e^{-t} \sum_{i=0}^{t(1-\epsilon)} \frac{t^{t(1-\epsilon)}}{(t(1-\epsilon))!}$$
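The monotonicity step can be checked numerically. The sketch below compares the exact lower-tail probability of a rate-1 Poisson process with the bound obtained by replacing each of the $n+1$ terms by the largest one, where $n$ plays the role of $t(1-\epsilon)$. The function names and the values $t = 40$, $n = 30$ are illustrative assumptions; terms are evaluated in log space to avoid overflow in the factorial.

```python
import math


def poisson_term(i, t):
    """e^{-t} t^i / i!, evaluated in log space for numerical stability."""
    return math.exp(-t + i * math.log(t) - math.lgamma(i + 1))


def poisson_cdf(n, t):
    """Exact P(N(t) <= n) for a rate-1 Poisson process."""
    return sum(poisson_term(i, t) for i in range(n + 1))


def largest_term_bound(n, t):
    """Upper bound (n + 1) * e^{-t} t^n / n!, valid when n < t because the
    terms are then increasing in i."""
    return (n + 1) * poisson_term(n, t)


if __name__ == "__main__":
    t, n = 40.0, 30          # n corresponds to t(1 - eps) with eps = 0.25
    print("exact lower tail :", poisson_cdf(n, t))
    print("largest-term bound:", largest_term_bound(n, t))
```

The bound is loose by roughly a factor of the number of terms, which is where the polynomial prefactor in the final result comes from.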

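We apply Stirling's approximation to the factorial in the denominator to obtain

$$
\begin{aligned}
P(N(t) \leq t(1-\epsilon)) &\leq \frac{\left(t(1-\epsilon) + 1\right) e^{-t}\, t^{t(1-\epsilon)}}{\sqrt{2\pi t(1-\epsilon)}\, \left(t(1-\epsilon)\right)^{t(1-\epsilon)} e^{-t(1-\epsilon)}} \\
&\leq a\sqrt{t+\delta}\; e^{-t\left(\epsilon + (1-\epsilon)\log(1-\epsilon)\right)} \qquad (36)
\end{aligned}
$$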

where we have defined $a$ and $\delta$ as



1 - e 
1 
Now, in order to simplify the exponent, we approximate $\log(1-\epsilon)$ as $-\epsilon$ for small $\epsilon$. Inserting this into equation (36) we obtain the first part of Theorem 2.

$$
\begin{aligned}
P(N(t) \leq t(1-\epsilon)) &\leq a\sqrt{t+\delta}\; e^{-t\left(\epsilon - \epsilon(1-\epsilon)\right)} \\
&\leq a\sqrt{t+\delta}\; e^{-t\epsilon^{2}}
\end{aligned}
$$

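To prove the bound on $t(1+\epsilon)$, we use the Chernoff bound. For a $\theta > 0$, we have

$$P(N(t) \geq t(1+\epsilon)) \leq e^{-\theta t(1+\epsilon)}\, E\left[e^{\theta N(t)}\right]$$

For a Poisson process, the above expectation is

$$E\left[e^{\theta N(t)}\right] = e^{t\left(e^{\theta} - 1\right)}$$

We insert this into the Chernoff bound to obtain

$$P(N(t) \geq t(1+\epsilon)) \leq e^{-t\left[\theta(1+\epsilon) - \left(e^{\theta} - 1\right)\right]}$$

To obtain the tightest possible bound, we maximize the expression inside the brackets in the exponent. The maximum is achieved for $\theta = \log(1+\epsilon)$. Using $\epsilon$ as an approximation for $\log(1+\epsilon)$, we obtain the second result of Theorem 2.

$$P(N(t) \geq t(1+\epsilon)) \leq e^{-t\epsilon^{2}}$$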
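The Chernoff step can be compared against the exact Poisson tail. The sketch below keeps the exact optimized exponent $(1+\epsilon)\log(1+\epsilon) - \epsilon$ at $\theta = \log(1+\epsilon)$, rather than the first-order approximation $\log(1+\epsilon) \approx \epsilon$ used in the text, because the exact form is guaranteed to be an upper bound; for small $\epsilon$ this exact exponent behaves like $\epsilon^{2}/2$. The function names and the values $t = 50$, $\epsilon = 0.2$ are illustrative assumptions.

```python
import math


def poisson_upper_tail(n, t):
    """Exact P(N(t) >= n) for a rate-1 Poisson process (integer n >= 0)."""
    below = sum(math.exp(-t + k * math.log(t) - math.lgamma(k + 1))
                for k in range(n))
    return 1.0 - below


def chernoff_bound(t, eps):
    """Chernoff bound exp(-t * [theta*(1+eps) - (e^theta - 1)]) evaluated at
    the optimizing theta = log(1 + eps)."""
    theta = math.log(1.0 + eps)
    return math.exp(-t * (theta * (1.0 + eps) - (math.exp(theta) - 1.0)))


if __name__ == "__main__":
    t, eps = 50.0, 0.2
    n = 60                   # t * (1 + eps), written out as an integer
    print("exact upper tail:", poisson_upper_tail(n, t))
    print("Chernoff bound  :", chernoff_bound(t, eps))
```

In this example the exact tail lies below the Chernoff bound, as it must for any $\theta > 0$.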